The procedure of using the sample correlation to derive the F-test
Consider multivariate regression:
First, we know from straight-line regression that:
This actually follows from
Since:
(The last equality in this formula has a typo.)
Example problem:
Method of Orthogonal Projections
Method of Lagrange Multipliers
Regression with orthogonal predictor variables
Regression hypothesis testing
Matrix algebra cheatsheet
Diagnostic tools and model selection
A high-leverage point can mask the evidence of an outlier
Some results for straight-line regression
Straight line prediction interval and band
Introducing new explanatory variables
The key step in proving this theorem is orthogonalizing the model.
Generalized least squares
High leverage points
DFFITS
Cook’s distance
Design matrix of less than full rank
There are two ways to handle this. One is to drop some variables, like $\alpha_1$ and $\gamma_1$, since we cannot estimate that many parameters. The other is to expand the design matrix like this.

Notice that we add two more observations to our data, with the goal of enforcing $\alpha_1+\alpha_2 = 0$ and $\beta_1+\beta_2 = 0$. The coefficients estimated from the expanded design matrix will satisfy the two constraints. Why does this work?

Intuitively, we now have n+2 data points, and we can fit the first n points just as well as before while also driving the error on the remaining 2 data points to zero, which is exactly what the two constraints ask for.

Put another way: previously you had to think about how to choose the value of $\alpha_1-\alpha_2$ to minimize the error; now we additionally minimize the error from the last two observations, and luckily these can be solved explicitly, reducing their error to 0 (see the sketch below).
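A minimal numpy sketch of this expanded-design-matrix trick, on a hypothetical $2\times 2$ two-way layout $y_{ij} = \mu + \alpha_i + \beta_j$ (the cell responses are made up):

```python
import numpy as np

# Columns: (mu, alpha_1, alpha_2, beta_1, beta_2); one observation per cell.
X = np.array([[1., 1., 0., 1., 0.],
              [1., 1., 0., 0., 1.],
              [1., 0., 1., 1., 0.],
              [1., 0., 1., 0., 1.]])
y = np.array([3., 5., 4., 6.])
print(np.linalg.matrix_rank(X))      # 3 < 5: less than full rank

# Two pseudo-observations (response 0) encode alpha_1 + alpha_2 = 0
# and beta_1 + beta_2 = 0.
X_aug = np.vstack([X, [0., 1., 1., 0., 0.],
                      [0., 0., 0., 1., 1.]])
y_aug = np.append(y, [0., 0.])
print(np.linalg.matrix_rank(X_aug))  # 5: full rank now

b, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)
print(b[1] + b[2], b[3] + b[4])      # both constraints hold (up to rounding)
```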
Multicollinearity in regression
Suppose you are fitting a regression model with two highly correlated explanatory variables. When you fit them jointly in a linear regression model:
We can see that neither term is significant. But if we fit them separately, we find:
both terms appear significant.
Intuitively, when we test whether a term is significant, our null hypothesis sets the coefficient of that variable to 0, and we refit the model to see whether it performs worse. If the model performs much worse, we conclude that the variable is important. But in our situation the two variables are highly correlated: no matter which one you remove, the other can still explain the response quite well, so when both are included in the model, both terms appear insignificant. The simulation below illustrates this.
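A simulated illustration (the data-generating setup is made up): fitting the two correlated predictors jointly makes both slope t-statistics small, while fitting each alone makes it look highly significant.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)      # x2 nearly duplicates x1
y = 1.0 + x1 + x2 + rng.normal(size=n)   # both variables truly matter

def t_stats(X, y):
    """t-statistics of the LSE coefficients."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    s2 = resid @ resid / (n - p)         # unbiased sigma^2 estimate
    return beta / np.sqrt(s2 * np.diag(XtX_inv))

ones = np.ones(n)
print(t_stats(np.column_stack([ones, x1, x2]), y))  # joint: slope t's tiny
print(t_stats(np.column_stack([ones, x1]), y))      # alone: slope t huge
print(t_stats(np.column_stack([ones, x2]), y))      # alone: slope t huge
```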
Assumptions on the error term
Some properties:
Under A2, an unbiased estimator of $\sigma^2$ is $\|Y-X \widehat{\beta}\|^{2} /(n-p)$.
The LSE and the residual vector are always uncorrelated. (Both of these A2 properties are checked numerically in the sketch below.)
Under A3,
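A quick Monte Carlo check of the two A2 properties above ($X$, $\beta$, and $\sigma$ below are made up):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, sigma = 50, 3, 2.0
X = rng.normal(size=(n, p))
beta = np.array([1.0, -2.0, 0.5])
B = np.linalg.pinv(X)                 # maps y to the LSE
P = X @ B                             # hat (projection) matrix

s2, b0, e0 = [], [], []
for _ in range(5000):
    y = X @ beta + sigma * rng.normal(size=n)
    e = y - P @ y                     # residual vector
    s2.append(e @ e / (n - p))
    b0.append((B @ y)[0])             # first coordinate of the LSE
    e0.append(e[0])                   # first residual

print(np.mean(s2))                    # close to sigma^2 = 4
print(np.cov(b0, e0)[0, 1])           # close to 0
```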
Fisher information of the regression parameters
Weighted least squares estimators
Gauss-Markov Theorem
Random effect model
The restricted maximum likelihood estimators
Qualifying exam 2019
- (a) asks to test a linear combination; use an F-test or a t-test. The F-test has $RSS_{H_0} - RSS$ in the numerator and $S^2$ in the denominator.
- (b) requires knowing that $Y_{n+1}$ and $\hat{Y}$ are independent, so the variance of their difference is the sum of their individual variances.
- (c): from (b) we can derive the distribution of $Y_{n+1}$ (see the display after this list).
- (d) is an introducing-new-explanatory-variables problem; the residuals can be written using $(I_n - P)X$.
- (e): the formula used is this one.
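Spelling out (b) and (c) under normal errors (writing $x_{n+1}$ for the new covariate vector, a notation introduced here):

$$Y_{n+1} - \hat{Y} \sim N\!\left(0,\; \sigma^2\left(1 + x_{n+1}'(X'X)^{-1}x_{n+1}\right)\right),$$

which gives the prediction interval $\hat{Y} \pm t_{n-p,\,\alpha/2}\, S\,\sqrt{1 + x_{n+1}'(X'X)^{-1}x_{n+1}}$.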
Qualifying exam 2018
- (a) uses regression with orthogonal predictor variables.
- (b) uses the expansion of $(Y-X\hat{\beta})'(Y-X\hat{\beta})$.
- (c): refer to here.
- (d) uses the conclusion of (a), since $X_1$ and $X_2$ are orthogonal.
- H indicates leverage, residuals indicate outliers, and DFFITS indicates influential points; the sketch below computes all three.
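Since leverage, studentized residuals, and DFFITS come up repeatedly, here is a minimal numpy sketch computing all three from the standard formulas (the function name is mine):

```python
import numpy as np

def regression_diagnostics(X, y):
    """Leverages h_i, externally studentized residuals t_i, and DFFITS_i
    for the linear model y = X beta + e."""
    n, p = X.shape
    H = X @ np.linalg.inv(X.T @ X) @ X.T
    h = np.diag(H)                      # leverage of each observation
    e = y - H @ y                       # ordinary residuals
    s2 = e @ e / (n - p)                # usual sigma^2 estimate
    # Leave-one-out variance estimates s_(i)^2
    s2_i = ((n - p) * s2 - e**2 / (1 - h)) / (n - p - 1)
    t = e / np.sqrt(s2_i * (1 - h))     # externally studentized residuals
    dffits = t * np.sqrt(h / (1 - h))   # influence on the i-th fitted value
    return h, t, dffits
```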
Qualifying exam 2017
(a): here we need to test $\beta_1 = 0$, and we only have the sample correlation information; we refer to here.
The basic idea is $r^2 = R^2 = \frac{RSS_{H_0}-RSS}{RSS_{H_0}}$.
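Spelling this out for straight-line regression, where $RSS = RSS_{H_0}(1 - r^2)$:

$$F = \frac{RSS_{H_0} - RSS}{RSS/(n-2)} = (n-2)\,\frac{r^2}{1-r^2} \;\sim\; F_{1,\,n-2} \quad \text{under } H_0,$$

or equivalently $t = \sqrt{n-2}\,\frac{r}{\sqrt{1-r^2}} \sim t_{n-2}$.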
(b): the basic steps of model diagnostics are checking independence, constant variance, and normality.
(e): how to compute the noncentrality parameter (ncp):
If $Y \sim N(\mu,\sigma^2)$, then $\left(\frac{Y}{\sigma}\right)^2 \sim \chi_1^2(a)$, where $a = \mu^2/\sigma^2$.
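The same fact in vector form, which is what the regression ncp calculations reduce to (a standard result, stated here for completeness):

$$Y \sim N_n(\mu, \sigma^2 I_n) \;\Longrightarrow\; \frac{\|Y\|^2}{\sigma^2} \sim \chi_n^2(\lambda), \qquad \lambda = \frac{\|\mu\|^2}{\sigma^2}.$$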
Qualifying exam 2016
- (f): if you don't want to redo the whole calculation, it helps to remember
Qualifying exam 2012
(a): the columns of the design matrix are highly correlated, so under either hypothesis $\beta_1 = 0$ or $\beta_2 = 0$, the reduced model is still a perfect fit. Hence neither $\beta_1$ nor $\beta_2$ is significant, while $\beta_0$ is significant. To calculate the coefficients, just do LSE.
(b) i. Due to multicollinearity.
Qualifying exam 2013
- (a): the MLE of $\beta$ is the same as the LSE, and the MLE of $\sigma^2$ is $Y'NY/n$.
- (c),(d): larger power means smaller variance of $\hat{\beta}$.
- (c): use
- (e): to minimize $C(k,\lambda)$, first compute $C(k+1,\lambda)-C(k,\lambda)$; if it is $<0$, continue. One finds that $C(k+1,\lambda)-C(k,\lambda)$ is exactly $Z_i-\lambda$, so $\hat{k}$ will be some kind of change point.
Qualifying exam 2014
- (a), (b): add a sum-to-zero constraint.
- (c): Fisher's LSD test.
- (d): use $F = (RSS_{H_0} - RSS)/RSS$, with the appropriate degrees-of-freedom scaling, to test the main effects and also the interaction effect (see the helper after this list).
- (e): no; the F-test can be used only when the numerator chi-square random variable is independent of the denominator chi-square random variable. A nested pair like trt and trt+bWeight can form an F-test, since $(RSS_{H_0} - RSS)/RSS$, scaled by the appropriate degrees of freedom, follows an F distribution; other pairs like trt and bWeight cannot achieve this.
- (g): it is a prediction for a new sample, so the variance should be $\hat{\sigma}^2\left(1 + x(X'X)^{-1}x'\right)$, the extra $\hat{\sigma}^2$ accounting for the new observation's own error.
- (h): check whether there is multicollinearity.
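A small helper (name and signature are mine) that makes the degrees-of-freedom scaling behind (d) and (e) explicit:

```python
from scipy import stats

def partial_f_test(rss_h0, df_h0, rss, df):
    """F statistic and p-value for a reduced model (RSS_H0, residual df_h0)
    tested against a nested full model (RSS, residual df)."""
    f = ((rss_h0 - rss) / (df_h0 - df)) / (rss / df)
    p_value = stats.f.sf(f, df_h0 - df, df)  # upper-tail probability
    return f, p_value
```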
Qualifying exam 2015
- (a): constructing $X$ here requires some tricks; the parameters need some reparameterization.
First fix the first, second, and fourth columns, then solve for the third column so that it is orthogonal to all the other columns.
- (c): brute-force calculation; use the orthogonality properties to cancel some terms.